PCMV-NS34A



1 


TCGCGCGTTT 
AGCGCGCAAA 


CGGTGATGAC 
GCCACTACTG 


GGTGAAAACC 
CCACTTTTGG 


TCTGACACAT 
AGACTGTGTA 


GCAGCTCCCG 
CGTCGAGGGC 


51 


GAGACGGTCA 
CTCTGCCAGT 


CAGCTTGTCT 
GTCGAACAGA 


GTAAGCGGAT 
CATTCGCCTA 


GCCGGGAGCA 
CGGCCCTCGT 


GACAAGCCCG 
CTGTTCGGGC 


101 


TCAGGGCGCG 
AGTCCCGCGC 


TCAGCGGGTG 
AGTCGCCCAC 


TTGGCGGGTG 
AACCGCCCAC 


TCGGGGCTGG 
AGCCCCGACC 


CTTAACTATG 
GAATTGATAC 


151 


CGGCATCAGA 
GCCGTAGTCT


GCAGATTGTA 
CGTCTAACAT 


CTGAGAGTGC 
GACTCTCACG 


ACCATATGAA 
TGGTATACTT 


GCTTTTTGCA 
CGAAAAACGT 




StuI 








201 


AAAGCCTAGG 
TTTCGGATCC 


CCTCCAAAAA 
GGAGGTTTTT 


AGCCTCCTCA 
TCGGAGGAGT 


CTACTTCTGG 
GATGAAGACC 


AATAGCTCAG 
TTATCGAGTC 


251 


AGGCCGAGGC 
TCCGGCTCCG 


GGCCTCGGCC 
CCGGAGCCGG 


TCTGCATAAA 
AGACGTATTT 


TAAAAAAAAT 
ATTTTTTTTA 


TAGTCAGCCA 
ATCAGTCGGT 


301 


TGGGGCGGAG 
ACCCCGCCTC 


AATGGGCGGA 
TTACCCGCCT 


ACTGGGCGGG 
TGACCCGCCC 


GAGGGAATTA 
CTCCCTTAAT 


TTGGCTATTG 
AACCGATAAC 


351 


GCCATTGCAT 
CGGTAACGTA 


ACGTTGTATC 
TGCAACATAG 


TATATCATAA 
ATATAGTATT 


TATGTACATT 
ATACATGTAA 


TATATTGGCT 
ATATAACCGA 


401 


CATGTCCAAT 
GTACAGGTTA 


ATGACCGCCA 
TACTGGCGGT 


TGTTGACATT 
ACAACTGTAA 


GATTATTGAC
CTAATAACTG 


TAGTTATTAA 
ATCAATAATT 


451 


TAGTAATCAA 
ATCATTAGTT 


TTACGGGGTC 
AATGCCCCAG 


ATTAGTTCAT 
TAATCAAGTA 


AGCCCATATA 
TCGGGTATAT 


TGGAGTTCCG 
ACCTCAAGGC 


501 


CGTTACATAA 
GCAATGTATT 


CTTACGGTAA 
GAATGCCATT 


ATGGCCCGCC 
TACCGGGCGG 


TGGCTGACCG 
ACCGACTGGC 


CCCAACGACC 
GGGTTGCTGG 


551 


CCCGCCCATT 
GGGCGGGTAA 


GACGTCAATA 
CTGCAGTTAT 


ATGACGTATG 
TACTGCATAC 


TTCCCATAGT 
AAGGGTATCA 


AACGCCAATA 
TTGCGGTTAT 


601 


GGGACTTTCC 
CCCTGAAAGG 


ATTGACGTCA 
TAACTGCAGT 


ATGGGTGGAG 
TACCCACCTC 


TATTTACGGT 
ATAAATGCCA 


AAACTGCCCA 
TTTGACGGGT 


651 


CTTGGCAGTA 
GAACCGTCAT 


CATCAAGTGT 
GTAGTTCACA 


ATCATATGCC 
TAGTATACGG 


AAGTCCGCCC 
TTCAGGCGGG 


CCTATTGACG 
GGATAACTGC 


701 


TCAATGACGG 
AGTTACTGCC 


TAAATGGCCC 
ATTTACCGGG 


GCCTGGCATT 
CGGACCGTAA 


ATGCCCAGTA 
TACGGGTCAT 


CATGACCTTA 
GTACTGGAAT 


751 


CGGGACTTTC 
GCCCTGAAAG 


CTACTTGGCA 
GATGAACCGT 


GTACATCTAC 
CATGTAGATG 


GTATTAGTCA 
CATAATCAGT 


TCGCTATTAC 
AGCGATAATG 


801 


CATGGTGATG 
GTACCACTAC 


CGGTTTTGGC 
GCCAAAACCG 


AGTACACCAA 
TCATGTGGTT 


TGGGCGTGGA 
ACCCGCACCT 


TAGCGGTTTG 
ATCGCCAAAC 


851 


ACTCACGGGG 
TGAGTGCCCC 


ATTTCCAAGT 
TAAAGGTTCA 


CTCCACCCCA 
GAGGTGGGGT 


TTGACGTCAA 
AACTGCAGTT 


TGGGAGTTTG 
ACCCTCAAAC 
901


TTTTGGCACC 
AAAACCGTGG 


AAAATCAACG
TTTTAGTTGC


GGACTTTCCA
CCTGAAAGGT


AAATGTCGTA
TTTACAGCAT


ATAACCCCGC 
TATTGGGGCG 


951 


CCCGTTGACG 
GGGCAACTGC 


CAAATGGGCG 
GTTTACCCGC 


GTAGGCGTGT
CATCCGCACA


ACGGTGGGAG
TGCCACCCTC


GTCTATATAA 
CAGATATATT 


1001 


GCAGAGCTCG 
CGTCTCGAGC 


TTTAGTGAAC 
AAATCACTTG 


CGTCAGATCG 
GCAGTCTAGC 


CCTGGAGACG
GGACCTCTGC


CCATCCACGC
GGTAGGTGCG


1051 


TGTTTTGACC 
ACAAAACTGG


TCCATAGAAG 
AGGTATCTTC 


ACACCGGGAC 
TGTGGCCCTG 


CGATCCAGCC
GCTAGGTCGG


TCCGCGGCCG 
AGGCGCCGGC 


1101 


GGAACGGTGC 
CCTTGCCACG 


ATTGGAACGC 
TAACCTTGCG 


GGATTCCCCG 
CCTAAGGGGC 


TGCCAAGAGT
ACGGTTCTCA


GACGTAAGTA 
CTGCATTCAT 


1151 


CCGCCTATAG 
GGCGGATATC 


ACTCTATAGG 
TGAGATATCC 


CACACCCCTT
GTGTGGGGAA


TGGCTCTTAT 
ACCGAGAATA 


GCATGCTATA 
CGTACGATAT 


1201 


CTGTTTTTGG 
GACAAAAACC 


CTTGGGGCCT 
GAACCCCGGA 


ATACACCCCC 
TATGTGGGGG 


GCTCCTTATG 
CGAGGAATAC 


CTATAGGTGA
GATATCCACT


1251 


TGGTATAGCT 
ACCATATCGA 


TAGCCTATAG 
ATCGGATATC


GTGTGGGTTA
CACACCCAAT 


TTGACCATTA
AACTGGTAAT


TTGACCACTC 
AACTGGTGAG 


1301 


CCCTATTGGT 
GGGATAACCA 


GACGATACTT 
CTGCTATGAA


TCCATTACTA 
AGGTAATGAT 


ATCCATAACA
TAGGTATTGT


TGGCTCTTTG
ACCGAGAAAC


1351


CCACAACTAT 
GGTGTTGATA 


CTCTATTGGC 
GAGATAACCG 


TATATGCCAA 
ATATACGGTT 


TACTCTGTCC
ATGAGACAGG


TTCAGAGACT 
AAGTCTCTGA 


1401 


GACACGGACT 
CTGTGCCTGA 


CTGTATTTTT 
GACATAAAAA 


ACAGGATGGG 
TGTCCTACCC 


GTCCATTTAT
CAGGTAAATA


TATTTACAAA 
ATAAATGTTT 


1451 


TTCACATATA 
AAGTGTATAT 


CAACAACGCC 
GTTGTTGCGG 


GTCCCCCGTG
CAGGGGGCAC


CCCGCAGTTT
GGGCGTCAAA


TTATTAAACA 
AATAATTTGT 


1501 


TAGCGTGGGA 
ATCGCACCCT 


TCTCCGACAT 
AGAGGCTGTA 


CTCGGGTACG
GAGCCCATGC


TGTTCCGGAC
ACAAGGCCTG


ATGGGCTCTT 
TACCCGAGAA 


1551 


CTCCGGTAGC 
GAGGCCATCG 


GGCGGAGCTT 
CCGCCTCGAA 


CCACATCCGA 
GGTGTAGGCT 


GCCCTGGTCC
CGGGACCAGG


CATCCGTCCA 
GTAGGCAGGT 


1601 


GCGGCTCATG 
CGCCGAGTAC 


GTCGCTCGGC 
CAGCGAGCCG 


AGCTCCTTGC 
TCGAGGAACG 


TCCTAACAGT 
AGGATTGTCA 


GGAGGCCAGA 
CCTCCGGTCT 


1651 


CTTAGGCACA 
GAATCCGTGT 


GCACAATGCC 
CGTGTTACGG 


CACCACCACC 
GTGGTGGTGG 


AGTGTGCCGC 
TCACACGGCG 


ACAAGGCCGT 
TGTTCCGGCA 


1701 


GGCGGTAGGG 
CCGCCATCCC 


TATGTGTCTG 
ATACACAGAC 


AAAATGAGCT 
TTTTACTCGA 


CGGAGATTGG 
GCCTCTAACC 


GCTCGCACCT 
CGAGCGTGGA 


1751 


GGACGCAGAT 
CCTGCGTCTA 


GGAAGACTTA AGGCAGCGGC AGAAGAAGAT GCAGGCAGCT 
CCTTCTGAAT TCCGTCGCCG TCTTCTTCTA CGTCCGTCGA 



1801 GAGTTGTTGT ATTCTGATAA GAGTCAGAGG TAACTCCCGT TGCGGTGCTG 
CTCAACAACA TAAGACTATT CTCAGTCTCC ATTGAGGGCA ACGCCACGAC 
1851 TTAACGGTGG AGGGCAGTGT AGTCTGAGCA GTACTCGTTG CTGCCGCGCG
AATTGCCACC TCCCGTCACA TCAGACTCGT CATGAGCAAC GACGGCGCGC



1901 CGCCACCAGA CATAATAGCT GACAGACTAA CAGACTGTTC CTTTCCATGG 
GCGGTGGTCT GTATTATCGA CTGTCTGATT GTCTGACAAG GAAAGGTACC 



1951 GTCTTTTCTG CAGTCACCGT CGTCGACCTA AGAATTCACC ATGGCGCCCA
CAGAAAAGAC GTCAGTGGCA GCAGCTGGAT TCTTAAGTGG TACCGCGGGT 



+2 I T A Y AQQ TRGL LGC IIT
2001 TCACGGCGTA CGCCCAGCAG ACAAGGGGCC TCCTAGGGTG CATAATCACC
AGTGCCGCAT GCGGGTCGTC TGTTCCCCGG AGGATCCCAC GTATTAGTGG 



+ 2SLTG RDK NQV EGEV QIV 
2051 AGCCTAACTG GCCGGGACAA AAACCAAGTG GAGGGTGAGG TCCAGATTGT 
TCGGATTGAC CGGCCCTGTT TTTGGTTCAC CTCCCACTCC AGGTCTAACA 



+ 2 STA AQTF LAT CIN GVC 
2101 GTCAACTGCT GCCCAAACCT TCCTGGCAAC GTGCATCAAT GGGGTGTGCT 
CAGTTGACGA CGGGTTTGGA AGGACCGTTG CACGTAGTTA CCCCACACGA



+ 2WTVY HGA GTRT IAS P K G 
2151 GGACTGTCTA CCACGGGGCC GGAACGAGGA CCATCGCGTC ACCCAAGGGT 
CCTGACAGAT GGTGCCCCGG CCTTGCTCCT GGTAGCGCAG TGGGTTCCCA 



+2PVIQ M Y T NVD QDLV GWP
2201 CCTGTCATCC AGATGTATAC CAATGTAGAC CAAGACCTTG TGGGCTGGCC 
GGACAGTAGG TCTACATATG GTTACATCTG GTTCTGGAAC ACCCGACCGG 



+2 ASQ GTRS LTP CTC GSS 
2251 CGCTTCGCAA GGTACCCGCT CATTGACACC CTGCACTTGC GGCTCCTCGG 
GCGAAGCGTT CCATGGGCGA GTAACTGTGG GACGTGAACG CCGAGGAGCC 



+2DLYL VT R HADV IPV RRR 
2301 ACCTTTACCT GGTCACGAGG CACGCCGATG TCATTCCCGT GCGCCGGCGG 
TGGAAATGGA CCAGTGCTCC GTGCGGCTAC AGTAAGGGCA CGCGGCCGCC 



+2GDSR GSL LSP RPIS YLK
2351 GGTGATAGCA GGGGCAGCCT GCTGTCGCCC CGGCCCATTT CCTACTTGAA 
CCACTATCGT CCCCGTCGGA CGACAGCGGG GCCGGGTAAA GGATGAACTT 



+ 2 GSS GGPL LCP AGH A V G 
2401 AGGCTCCTCG GGGGGTCCGC TGTTGTGCCC CGCGGGGCAC GCCGTGGGCA 
TCCGAGGAGC CCCCCAGGCG ACAACACGGG GCGCCCCGTG CGGCACCCGT 



+2 I F R A AVC TRGV A K A VDF
2451 TATTTAGGGC CGCGGTGTGC ACCCGTGGAG TGGCTAAGGC GGTGGACTTT 
ATAAATCCCG GCGCCACACG TGGGCACCTC ACCGATTCCG CCACCTGAAA 



+2IPVE NLE TTM RSPV FTD 
2501 ATCCCTGTGG AGAACCTAGA GACAACCATG AGGTCCCCGG TGTTCACGGA 
TAGGGACACC TCTTGGATCT CTGTTGGTAC TCCAGGGGCC ACAAGTGCCT 
+2 NSS PPVV PQS F Q V A H L
2551 TAACTCCTCT CCACCAGTAG TGCCCCAGAG CTTCCAGGTG GCTCACCTCC 
ATTGAGGAGA GGTGGTCATC ACGGGGTCTC GAAGGTCCAC CGAGTGGAGG 



+2HAPT GSG KSTK VPA A Y A
2601 ATGCTCCCAC AGGCAGCGGC AAAAGCACCA AGGTCCCGGC TGCATATGCA 
TACGAGGGTG TCCGTCGCCG TTTTCGTGGT TCCAGGGCCG ACGTATACGT 



+2AQGY KVL VLN PSVA ATL 
2651 GCTCAGGGCT ATAAGGTGCT AGTACTCAAC CCCTCTGTTG CTGCAACACT 
CGAGTCCCGA TATTCCACGA TCATGAGTTG GGGAGACAAC GACGTTGTGA 



+2 GFG AYMS KAH GID PNI
2701 GGGCTTTGGT GCTTACATGT CCAAGGCTCA TGGGATCGAT CCTAACATCA 
CCCGAAACCA CGAATGTACA GGTTCCGAGT ACCCTAGCTA GGATTGTAGT 



+2RT GV RTI TTGS PIT YST 
2751 GGACCGGGGT GAGAACAATT ACCACTGGCA GCCCCATCAC GTACTCCACC 
CCTGGCCCCA CTCTTGTTAA TGGTGACCGT CGGGGTAGTG CATGAGGTGG 



+2YGKF LAD GGC SGGA YDI 
2801 TACGGCAAGT TCCTTGCCGA CGGCGGGTGC TCGGGGGGCG CTTATGACAT 
ATGCCGTTCA AGGAACGGCT GCCGCCCACG AGCCCCCCGC GAATACTGTA 



+2 IIC DECH STD ATS ILG 
2851 AATAATTTGT GACGAGTGCC ACTCCACGGA TGCCACATCC ATCTTGGGCA 
TTATTAAACA CTGCTCACGG TGAGGTGCCT ACGGTGTAGG TAGAACCCGT 



+2IGTV LDQ AETA GAR L V V 
2901 TTGGCACTGT CCTTGACCAA GCAGAGACTG CGGGGGCGAG ACTGGTTGTG
AACCGTGACA GGAACTGGTT CGTCTCTGAC GCCCCCGCTC TGACCAACAC



+2 LATA TPP GSV TVPH PNI 
2951 CTCGCCACCG CCACCCCTCC GGGCTCCGTC ACTGTGCCCC ATCCCAACAT 
GAGCGGTGGC GGTGGGGAGG CCCGAGGCAG TGACACGGGG TAGGGTTGTA 



+2 EEV ALST TGE IPF YGK 
3001 CGAGGAGGTT GCTCTGTCCA CCACCGGAGA GATCCCTTTT TACGGCAAGG 
GCTCCTCCAA CGAGACAGGT GGTGGCCTCT CTAGGGAAAA ATGCCGTTCC 



+2AIPL EVI KGGR HLI FCH 
3051 CTATCCCCCT CGAAGTAATC AAGGGGGGGA GACATCTCAT CTTCTGTCAT
GATAGGGGGA GCTTCATTAG TTCCCCCCCT CTGTAGAGTA GAAGACAGTA 



+ 2SKKK CDE LAA KLVA LGI 
3101 TCAAAGAAGA AGTGCGACGA ACTCGCCGCA AAGCTGGTCG CATTGGGCAT 
AGTTTCTTCT TCACGCTGCT TGAGCGGCGT TTCGACCAGC GTAACCCGTA 



+2 N A V AYYR GLD V S V IPT 
3151 CAATGCCGTG GCCTACTACC GCGGTCTTGA CGTGTCCGTC ATCCCGACCA 
GTTACGGCAC CGGATGATGG CGCCAGAACT GCACAGGCAG TAGGGCTGGT



+2SGDV VVV ATDA LMT GYT 
3201 GCGGCGATGT TGTCGTCGTG GCAACCGATG CCCTCATGAC CGGCTATACC 
CGCCGCTACA ACAGCAGCAC CGTTGGCTAC GGGAGTACTG GCCGATATGG 
+2GDFD S V I DCN TCVT QTV
3251 GGCGACTTCG ACTCGGTGAT AGACTGCAAT ACGTGTGTCA CCCAGACAGT 
CCGCTGAAGC TGAGCCACTA TCTGACGTTA TGCACACAGT GGGTCTGTCA 



+2 DFS LDPT FTI ETI TLP 
3301 CGATTTCAGC CTTGACCCTA CCTTCACCAT TGAGACAATC ACGCTCCCCC 
GCTAAAGTCG GAACTGGGAT GGAAGTGGTA ACTCTGTTAG TGCGAGGGGG 



+2QDAV SRT QRRG RTG RGK 
3351 AAGATGCTGT CTCCCGCACT CAACGTCGGG GCAGGACTGG CAGGGGGAAG 
TTCTACGACA GAGGGCGTGA GTTGCAGCCC CGTCCTGACC GTCCCCCTTC



+ 2PGIY RFV A P G ERPS GMF 
3401 CCAGGCATCT ACAGATTTGT GGCACCGGGG GAGCGCCCCT CCGGCATGTT
GGTCCGTAGA TGTCTAAACA CCGTGGCCCC CTCGCGGGGA GGCCGTACAA 



+ 2 DSS VLCE CYD AGC A W Y 
3451 CGACTCGTCC GTCCTCTGTG AGTGCTATGA CGCAGGCTGT GCTTGGTATG 
GCTGAGCAGG CAGGAGACAC TCACGATACT GCGTCCGACA CGAACCATAC 



+ 2ELTP AET TVRL RAY MNT 
3501 AGCTCACGCC CGCCGAGACT ACAGTTAGGC TACGAGCGTA CATGAACACC 
TCGAGTGCGG GCGGCTCTGA TGTCAATCCG ATGCTCGCAT GTACTTGTGG 



+ 2 PGLP VCQ DHL EFWE GVF 
3551 CCGGGGCTTC CCGTGTGCCA GGACCATCTT GAATTTTGGG AGGGCGTCTT 
GGCCCCGAAG GGCACACGGT CCTGGTAGAA CTTAAAACCC TCCCGCAGAA 



+2 TGL THID AHF LSQ TKQ 
StuI 



3601 TACAGGCCTC ACTCATATAG ATGCCCACTT TCTATCCCAG ACAAAGCAGA
ATGTCCGGAG TGAGTATATC TACGGGTGAA AGATAGGGTC TGTTTCGTCT 



+2SGEN LPY LVAY QAT VCA 
3651 GTGGGGAGAA CCTTCCTTAC CTGGTAGCGT ACCAAGCCAC CGTGTGCGCT 
CACCCCTCTT GGAAGGAATG GACCATCGCA TGGTTCGGTG GCACACGCGA 



+2RAQA PPP SWD Q M W K CLI
3701 AGGGCTCAAG CCCCTCCCCC ATCGTGGGAC CAGATGTGGA AGTGTTTGAT 
TCCCGAGTTC GGGGAGGGGG TAGCACCCTG GTCTACACCT TCACAAACTA 



+2 RLK PTLH GPT PLL YRL 
3751 TCGCCTCAAG CCCACCCTCC ATGGGCCAAC ACCCCTGCTA TACAGACTGG
AGCGGAGTTC GGGTGGGAGG TACCCGGTTG TGGGGACGAT ATGTCTGACC 



+2GAVQ NEI TLTH PVT KYI 
3801 GCGCTGTTCA GAATGAAATC ACCCTGACGC ACCCAGTCAC CAAATACATC 
CGCGACAAGT CTTACTTTAG TGGGACTGCG TGGGTCAGTG GTTTATGTAG 



+2MTCM SAD LEV VTST WVL 
3851 ATGACATGCA TGTCGGCCGA CCTGGAGGTC GTCACGAGCA CCTGGGTGCT 
TACTGTACGT ACAGCCGGCT GGACCTCCAG CAGTGCTCGT GGACCCACGA 



+2 VGG VLAA L A A YCL STG
3901 CGTTGGCGGC GTCCTGGCTG CTTTGGCCGC GTATTGCCTG TCAACAGGCT 
GCAACCGCCG CAGGACCGAC GAAACCGGCG CATAACGGAC AGTTGTCCGA 
+2 C V V I VGR VVLS GKP AII
3951 GCGTGGTCAT AGTGGGCAGG GTCGTCTTGT CCGGGAAGCC GGCAATCATA
CGCACCAGTA TCACCCGTCC CAGCAGAACA GGCCCTTCGG CCGTTAGTAT 



+2PDRE VLY REF DEME EC 
4001 CCTGACAGGG AAGTCCTCTA CCGAGAGTTC GATGAGATGG AAGAGTGCTA
GGACTGTCCC TTCAGGAGAT GGCTCTCAAG CTACTCTACC TTCTCACGAT 



BamHI MluI



4051 GGATCCACTA CGCGTTAGAG CTCGCTGATC AGCCTCGACT GTGCCTTCTA
CCTAGGTGAT GCGCAATCTC GAGCGACTAG TCGGAGCTGA CACGGAAGAT 



4101 GTTGCCAGCC ATCTGTTGTT TGCCCCTCCC CCGTGCCTTC CTTGACCCTG 
CAACGGTCGG TAGACAACAA ACGGGGAGGG GGCACGGAAG GAACTGGGAC 



4151 GAAGGTGCCA CTCCCACTGT CCTTTCCTAA TAAAATGAGG AAATTGCATC 
CTTCCACGGT GAGGGTGACA GGAAAGGATT ATTTTACTCC TTTAACGTAG 



4201 GCATTGTCTG AGTAGGTGTC ATTCTATTCT GGGGGGTGGG GTGGGGCAGG 
CGTAACAGAC TCATCCACAG TAAGATAAGA CCCCCCACCC CACCCCGTCC 



4251 ACAGCAAGGG GGAGGATTGG GAAGACAATA GCAGGCATGC TGGGGAGCTC 
TGTCGTTCCC CCTCCTAACC CTTCTGTTAT CGTCCGTACG ACCCCTCGAG 



4301 TTCCGCTTCC TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC
AAGGCGAAGG AGCGAGTGAC TGAGCGACGC GAGCCAGCAA GCCGACGCCG 



4351 GAGCGGTATC AGCTCACTCA AAGGCGGTAA TACGGTTATC CACAGAATCA
CTCGCCATAG TCGAGTGAGT TTCCGCCATT ATGCCAATAG GTGTCTTAGT 



4401 GGGGATAACG CAGGAAAGAA CATGTGAGCA AAAGGCCAGC AAAAGGCCAG 
CCCCTATTGC GTCCTTTCTT GTACACTCGT TTTCCGGTCG TTTTCCGGTC 



4451 GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG CTCCGCCCCC 
CTTGGCATTT TTCCGGCGCA ACGACCGCAA AAAGGTATCC GAGGCGGGGG 



4501 CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 
GACTGCTCGT AGTGTTTTTA GCTGCGAGTT CAGTCTCCAC CGCTTTGGGC 



4551 ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG 
TGTCCTGATA TTTCTATGGT CCGCAAAGGG GGACCTTCGA GGGAGCACGC 



4601 CTCTCCTGTT CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC 
GAGAGGACAA GGCTGGGACG GCGAATGGCC TATGGACAGG CGGAAAGAGG 



4651 CTTCGGGAAG CGTGGCGCTT TCTCAATGCT CACGCTGTAG GTATCTCAGT
GAAGCCCTTC GCACCGCGAA AGAGTTACGA GTGCGACATC CATAGAGTCA 



4701 TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG AACCCCCCGT 
AGCCACATCC AGCAAGCGAG GTTCGACCCG ACACACGTGC TTGGGGGGCA 



4751 TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT GAGTCCAACC 
AGTCGGGCTG GCGACGCGGA ATAGGCCATT GATAGCAGAA CTCAGGTTGG



4801 CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 
GCCATTCTGT GCTGAATAGC GGTGACCGTC GTCGGTGACC ATTGTCCTAA
4851


AGCAGAGCGA 
TCGTCTCGCT 


GGTATGTAGG 
CCATACATCC 


CGGTGCTACA 
GCCACGATGT 


GAGTTCTTGA 
CTCAAGAACT 


AGTGGTGGCC 
TCACCACCGG 


4901 


TAACTACGGC 
ATTGATGCCG


TACACTAGAA 
ATGTGATCTT 


GGACAGTATT 
CCTGTCATAA 


TGGTATCTGC 
ACCATAGACG 


GCTCTGCTGA 
CGAGACGACT 


4951 


AGCCAGTTAC 
TCGGTCAATG 


CTTCGGAAAA 
GAAGCCTTTT 


AGAGTTGGTA 
TCTCAACCAT 


GCTCTTGATC 
CGAGAACTAG 


CGGCAAACAA 
GCCGTTTGTT 


5001 


ACCACCGCTG 
TGGTGGCGAC 


GTAGCGGTGG 
CATCGCCACC 


TTTTTTTGTT
AAAAAAACAA


TGCAAGCAGC 
ACGTTCGTCG 


AGATTACGCG 
TCTAATGCGC 


5051 


CAGAAAAAAA
GTCTTTTTTT


GGATCTCAAG 
CCTAGAGTTC 


AAGATCCTTT 
TTCTAGGAAA 


GATCTTTTCT 
CTAGAAAAGA 


ACGGGGTCTG 
TGCCCCAGAC 


5101 


ACGCTCAGTG 
TGCGAGTCAC 


GAACGAAAAC 
CTTGCTTTTG 


TCACGTTAAG 
AGTGCAATTC 


GGATTTTGGT 
CCTAAAACCA 


CATGAGATTA 
GTACTCTAAT 


5151 


TCAAAAAGGA 
AGTTTTTCCT 


TCTTCACCTA 
AGAAGTGGAT 


GATCCTTTTA 
CTAGGAAAAT 


AATTAAAAAT 
TTAATTTTTA 


GAAGTTTTAA 
CTTCAAAATT 


5201 


ATCAATCTAA 
TAGTTAGATT 


AGTATATATG 
TCATATATAC 


AGTAAACTTG 
TCATTTGAAC


GTCTGACAGT 
CAGACTGTCA 


TACCAATGCT 
ATGGTTACGA 


5251 


TAATCAGTGA 
ATTAGTCACT 


GGCACCTATC 
CCGTGGATAG 


TCAGCGATCT 
AGTCGCTAGA 


GTCTATTTCG 
CAGATAAAGC 


TTCATCCATA 
AAGTAGGTAT 


5301 


GTTGCCTGAC 
CAACGGACTG


TCCCCGTCGT 
AGGGGCAGCA 


GTAGATAACT 
CATCTATTGA 


ACGATACGGG 
TGCTATGCCC 


AGGGCTTACC 
TCCCGAATGG 


5351 


ATCTGGCCCC 
TAGACCGGGG 


AGTGCTGCAA 
TCACGACGTT 


TGATACCGCG 
ACTATGGCGC 


AGACCCACGC 
TCTGGGTGCG 


TCACCGGCTC 
AGTGGCCGAG 


5401 


CAGATTTATC 
GTCTAAATAG


AGCAATAAAC 
TCGTTATTTG 


CAGCCAGCCG 
GTCGGTCGGC 


GAAGGGCCGA GCGCAGAAGT 
CTTCCCGGCT CGCGTCTTCA 


5451 


GGTCCTGCAA 
CCAGGACGTT 


CTTTATCCGC 
GAAATAGGCG 


CTCCATCCAG 
GAGGTAGGTC 


TCTATTAATT 
AGATAATTAA 


GTTGCCGGGA 
CAACGGCCCT 


5501 


AGCTAGAGTA
TCGATCTCAT 


AGTAGTTCGC 
TCATCAAGCG


CAGTTAATAG 
GTCAATTATC 


TTTGCGCAAC 
AAACGCGTTG 


GTTGTTGCCA 
CAACAACGGT 


5551 


TTGCTACAGG 
AACGATGTCC 


CATCGTGGTG 
GTAGCACCAC 


TCACGCTCGT 
AGTGCGAGCA 


CGTTTGGTAT 
GCAAACCATA 


GGCTTCATTC 
CCGAAGTAAG 


5601 


AGCTCCGGTT 
TCGAGGCCAA 


CCCAACGATC 
GGGTTGCTAG 


AAGGCGAGTT 
TTCCGCTCAA 


ACATGATCCC 
TGTACTAGGG 


CCATGTTGTG 
GGTACAACAC 


5651 


CAAAAAAGCG 
GTTTTTTCGC 


GTTAGCTCCT 
CAATCGAGGA 


TCGGTCCTCC 
AGCCAGGAGG 


GATCGTTGTC 
CTAGCAACAG 


AGAAGTAAGT 
TCTTCATTCA 


5701 


TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT
ACCGGCGTCA CAATAGTGAG TACCAATACC GTCGTGACGT ATTAAGAGAA 


5751 


ACTGTCATGC 
TGACAGTACG 


CATCCGTAAG 
GTAGGCATTC 


ATGCTTTTCT 
TACGAAAAGA 


GTGACTGGTG 
CACTGACCAC 


AGTACTCAAC 
TCATGAGTTG 
5801 CAAGTCATTC TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG
GTTCAGTAAG ACTCTTATCA CATACGCCGC TGGCTCAACG AGAACGGGCC 



5851 CGTCAATACG GGATAATACC GCGCCACATA GCAGAACTTT AAAAGTGCTC 
GCAGTTATGC CCTATTATGG CGCGGTGTAT CGTCTTGAAA TTTTCACGAG 



5901 ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA TCTTACCGCT 
TAGTAACCTT TTGCAAGAAG CCCCGCTTTT GAGAGTTCCT AGAATGGCGA 



5951 GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC TGATCTTCAG 
CAACTCTAGG TCAAGCTACA TTGGGTGAGC ACGTGGGTTG ACTAGAAGTC



6001 CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 
GTAGAAAATG AAAGTGGTCG CAAAGACCCA CTCGTTTTTG TCCTTCCGTT 



6051 AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT 
TTACGGCGTT TTTTCCCTTA TTCCCGCTGT GCCTTTACAA CTTATGAGTA 



6101 ACTCTTCCTT TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA 
TGAGAAGGAA AAAGTTATAA TAACTTCGTA AATAGTCCCA ATAACAGAGT 



6151 TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA AATAGGGGTT 
ACTCGCCTAT GTATAAACTT ACATAAATCT TTTTATTTGT TTATCCCCAA 



6201 CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG AAACCATTAT
GGCGCGTGTA AAGGGGCTTT TCACGGTGGA CTGCAGATTC TTTGGTAATA 



6251 TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG CCCTTTCGTC 
ATAGTACTGT AATTGGATAT TTTTATCCGC ATAGTGCTCC GGGAAAGCAG 
MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuVal
2 AGCTTACAAAACAAATTCACCATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTA 
TCGAATGTTTTGTTTAAGTGGTACCGACGTATACGTCGAGTCCCGATATTCCACGATCAT 

1 HIND3, 21 NCOI, 30 NDEI, 58 SCAI, 

LeuAsnProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGly 
62 CTCAACCCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGG 
GAGTTGGGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCC 

IleAspProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyr
122 ATCGATCCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTAC 
TAGCTAGGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATG 

122 CLAI, 

SerThrTyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIle
182 TCCACCTACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATA 
AGGTGGATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTAT 

IleCysAspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeu
242 ATTTGTGACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTT 
TAAACACTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAA 

AspGlnAlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGly 
302 GACCAAGCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGC 
CTGGTTCGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCG 

309 ALWN1, 

SerValThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIle
362 TCCGTCACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATC 
AGGCAGTGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAG 

ProPheTyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePhe
422 CCTTTTTACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTC 
GGAAAAATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAG 

CysHisSerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsn
482 TGTCATTCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAAT 
ACAGTAAGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTA 

AlaValAlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValVal
542 GCCGTGGCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTC 
CGGCACCGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAG 

556 SAC2, 566 DRD1, 

ValValAlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAsp
602 GTCGTGGCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGAC
CAGCACCGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTG 

621 BSPH1, 


CysAsnThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGlu
662 TGCAATACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAG
ACGTTATGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTC 

ThrIleThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArg
722 ACAATCACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGG 
TGTTAGTGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCC 

GlyLysProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAsp
782 GGGAAGCCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGAC 
CCCTTCGGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTG 

822 BGLI, 839 DRD1, 

SerSerValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAla 
842 TCGTCCGTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCC
AGCAGGCAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGG 

887 SACI,

GluThrThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAsp 
902 GAGACTACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGAC 
CTCTGATGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTG 

937 SMAI XMAI, 

HisLeuGluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeu 
962 CATCTTGAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTA 
GTAGAACTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGAT 

991 STUI,

SerGlnThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrVal 
1022 TCCCAGACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTG 
AGGGTCTGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCAC 

1075 DRA3, 

CysAlaArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArg 
1082 TGCGCTAGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGC 
ACGCGATCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCG 

LeuLysProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsn 
1142 CTCAAGCCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAAT 
GAGTTCGGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTA 

1156 NCOI, 

GluIleThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeu
1202 GAAATCACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTG
CTTTAGTGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGAC

1236 BSPH1, 1240 DRD1, 1243 AVA3, 1251 EAG1 XMA3, 1256 DRD1,



GluValValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyr
1262 GAGGTCGTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTAT 
CTCCAGCAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATA 
CysLeuSerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAla
1322 TGCCTGTCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCA
ACGGACAGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGT 



IleIleProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGln
1382 ATCATACCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAG 
TAGTATGGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTC 

1391 DRD1, 

HisLeuProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeu
1442 CACTTACCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTC 
GTGAATGGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAG 

GlyLeuLeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsn

1502 GGCCTGCTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAAC
CCGGAGGACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTG 

1508 PSTI, 1513 TTH3I, 

TrpGlnLysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGln
1562 TGGCAAAAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAA 
ACCGTTTTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTT 

1571 XHOI, 1592 NDEI, 

TyrLeuAlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPhe
1622 TACTTGGCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTT
ATGAACCGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAA 



ThrAlaAlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGly
1682 ACAGCTGCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGG 
TGTCGACGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCC 

1683 ALWN1 PVU2, 

GlyTrpValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGly 
1742 GGGTGGGTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGC
CCCACCCACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCG 



LeuAlaGlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAla
1802 TTAGCTGGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCA
AATCGACCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGT 

1808 KAS1 NARI,

GlyTyrGlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluVal 
1862 GGGTATGGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTC
CCCATACCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAG 
1884 SACI, 1905 BSPH1,

ProSerThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuVal
1922 CCCTCCACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTA 
GGGAGGTGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCAT 

1934 TTH3I, 

ValGlyValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaVal
1982 GTCGGCGTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTG 
CAGCCGCACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCAC 

2010 NAEI, 2023 SMAI XMAI,

GlnTrpMetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHis 
2042 CAGTGGATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCAC 
GTCACCTACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTG 

2073 SMAI XMAI, 2099 DRA3,

TyrValProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrVal
2102 TACGTGCCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTA 
ATGCACGGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACAT 

2121 PVU2, 

ThrGlnLeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSer 
2162 ACCCAGCTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCC 
TGGGTCGAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGG 

2165 ALWN1, 2170 MST2, 

GlySerTrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThr 
2222 GGTTCCTGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACC 
CCAAGGACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGG 

2226 EC0N1, 

TrpLeuLysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArg
2282 TGGCTAAAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGC 
ACCGATTTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCG 

2291 ESP1, 2306 PVU2, 2316 BAMHI, 

GlyTyrLysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAla
2342 GGGTATAAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCT 
CCCATATTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGA 

GluIleThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArg
2402 GAGATCACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGG 
CTCTAGTGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCC

2431 BSAB1, 2447 AVR2, 2454 SSE8387I, 2455 PSTI,

AsnMetTrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeu
2462 AACATGTGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTT
TTGTACACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAA 
2486 ASE1, 2503 APAI,

ProAlaProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIle
2522 CCTGCGCCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATA
GGACGCGGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTAT 

2559 PSTI, 

ArgGlnValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysPro 
2582 AGGCAGGTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCG 
TCCGTCCACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGC 

2600 DRA3, 

CysGlnValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPhe 
2642 TGCCAGGTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTT 
ACGGTCCAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAA 

AlaProProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGlu
2702 GCGCCCCCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAA 
CGCGGGGGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTT 

TyrProValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSer 
2762 TACCCGGTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCC 
ATGGGCCATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGG 

2763 HGIE2, 2815 AAT2, 

MetLeuThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGly 
2822 ATGCTCACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGA 
TACGAGTGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCT 

2856 EAG1 XMA3, 

SerProProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAla 
2882 TCACCCCCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCA 
AGTGGGGGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGT 

2895 BALI, 2909 NHEI, 

ThrCysThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrp 
2942 ACTTGCACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGG 
TGAACGTGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACC 

2972 ESP1, 2975 SACI, 

ArgGlnGluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeu
3002 AGGCAGGAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTG
TCCGTCCTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGAC 

AspSerPheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGlu
3062 GACTCCTTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAA
CTGAGGAAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTT 

3102 BGL2,
IleLeuArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyr
3122 ATCCTGCGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTAT 
TAGGACGCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATA 

3149 ALWN1, 3170 EAG1 XMA3, 

AsnProProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGly 
3182 AACCCCCCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGC 
TTGGGGGGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCG 

3223 HGIE2, 3235 NCOI, 

CysProLeuProProProLysSerProProValProProProArgLysLysArgThrVal 
3242 TGCCCGCTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTG
ACGGGCGAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCAC 

ValLeuThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGly 
3302 GTCCTCACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGC 
CAGGAGTGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCG 

3338 SACI, 3352 HIND3, 

SerSerSerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaPro
3362 AGCTCCTCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCT 
TCGAGGAGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGA 

SerGlyCysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGly 
3422 TCTGGCTGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGG
AGACCGACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCC 

3443 EAM1105I,

GluProGlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsn 
3482 GAGCCTGGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAAC
CTCGGACCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTG 

3490 BAMHI, 3491 BSAB1, 3493 BSPE1, 

AlaGluAspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrPro 
3542 GCGGAGGATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCG 
CGCCTCCTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGC 

3595 DRA3, 

CysAlaAlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHis 
3602 TGCGCCGCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCAC 
ACGCGGCGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTG 

3606 SAC2, 3617 ALWN1, 3661 PFLM1, 

HisAsnLeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThr 
3662 CACAATTTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACA
GTGTTAAACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGT

3687 DRA3,

PheAspArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAla
3722 TTTGACAGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCA
AAACTGTCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGT 

AlaAlaSerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrPro 
3782 GCGGCGTCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCC 
CGCCGCAGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGG 

3822 HIND3, 

ProHisSerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArg 
3842 CCACACTCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGA 
GGTGTGAGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCT 

3881 AAT2, 3896 BGLI, 

LysAlaValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrPro
3902 AAGGCCGTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCA
TTCCGGCATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGT 

IleAspThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGly
3962 ATAGACACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGT 
TATCTGTGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCA 

ArgLysProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMet 
4022 CGTAAGCCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATG
GCATTCGGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTAC 

AlaLeuTyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPhe 
4082 GCTTTGTACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTC
CGAAACATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAG 

GlnTyrSerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThr 
4142 CAATACTCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACC
GTTATGAGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGG 

4166 ECORI, 

ProMetGlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIle 
4202 CCAATGGGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATC 
GGTTACCCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAG 

4235 DRD1, 4242 ALWN1, 

ArgThrGluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIle
4262 CGTACGGAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATC 
GCATGCCTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAG 

4307 BGLI, 4314 BALI, 

LysSerLeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsn 
4322 AAGTCCCTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAAC 
TTCAGGGAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTG 

4351 APAI, 

CysGlyTyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeu
4382 TGCGGCTATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTC
ACGCCGATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAG

ThrCysTyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMet
4442 ACTTGCTACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATG
TGAACGATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTAC 

4458 SMAI XMAI,

LeuValCysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAla
4502 CTCGTGTGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCG 
GAGCACACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGC 

4514 DRD1, 4517 TTH3I,

AlaSerLeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspPro 
4562 GCGAGCCTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCC
CGCTCGGACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGG 

ProGlnProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAla
4622 CCACAACCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCC
GGTGTTGGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGG 

4643 SACI,

HisAspGlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAla 
4682 CACGACGGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCG
GTGCTGCCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGC 

4737 NRUI, 

ArgAlaAlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIle
4742 AGAGCTGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATC
TCTCGACGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAG 

MetPheAlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeu
4802 ATGTTTGCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTT
TACAAACGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAA 

4812 PFLM1, 4813 DRA3, 

IleAlaArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSer 
4862 ATAGCCAGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCC
TATCGGTCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGG 

4899 BGL2,

IleGluProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSer
4922 ATAGAACCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCA
TATCTTGGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGT 

4960 NCOI,

LeuHisSerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGly 
4982 CTCCACAGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGG
GAGGTGTCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCC

5021 SPHI, 5041 KPNI,

ValProProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAla
5042 GTACCGCCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCC 
CATGGCGGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGG 

5070 APAI, 5097 BALI, 

ArgGlyGlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLys
5102 AGAGGAGGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAG 
TCTCCTCCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTC 

5119 NDEI, 

LeuLysLeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAla 
5162 CTCAAACTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCT 
GAGTTTGAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGA 

5180 NOTI, 5181 EAG1 XMA3, 5188 BALI, 5192 PVU2, 

GlyTyrSerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrp 
5222 GGCTACAGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGG 
CCGATGTCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACC 

5246 DRA3, 

PheCysLeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP
5282 TTTTGCCTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAAGG 
AAAACGGATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTCC 

5301 PSTI, 5331 HGIE2, 



5342 TTGGGGTAAACACTCCGGCCTAAAAAAAAAAAAAAATCTAGAACCCGAGTCGAC 
AACCCCATTTGTGAGGCCGGATTTTTTTTTTTTTTTAGATCTTGGGCTCAGCTG 

5378 XBAI, 5390 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWN1,

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRD1,

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPH1, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC 

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle 
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWN1 PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla 
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA

1802 KAS1 NARI,

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG

1878 SACI, 1899 BSPH1,

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI,

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECON1,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESP1, 2300 PVU2, 2310 BAMHI, 

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSAB1, 2441 AVR2, 2448 SSE83871, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASE1, 2497 APAI, 

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAG1 XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI, 

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln 
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG

3143 ALWN1, 3164 EAG1 XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI,

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3,

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSAB1, 3487 BSPE1,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI,

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

4229 DRD1, 4236 ALWN1, 

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG 
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI, 

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLM1, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA 
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI, 

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI,

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgOP
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGATGAATAGTCGAC 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTACTTATCAGCTG 

5295 PSTI, 5336 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWN1, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG 
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRD1, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPH1, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT 
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWN1 PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KAS1 NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPH1, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3,

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECON1, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESP1, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSAB1, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASE1, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI,

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 



ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAG1 XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG

2889 BALI, 2903 NHEI,
ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSAB1, 3487 BSPE1, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 
3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 
4229 DRD1, 4236 ALWN1,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT 
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLM1, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI, 

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 
5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWN1, 5558 ESP1, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysOC AM 
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGTAATAGTCG
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCATTATCAGC 

5650 APAI, 5698 SALI, 



5702 AC 
TG 
MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWN1, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA 

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRD1, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPH1, 
ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG 
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC 

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle 
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1, 



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACGTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI,

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWN1 PVU2,

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC 

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla 
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KAS1 NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC 
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPH1, 
ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG

1928 TTH3I,

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI,

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECON1, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESP1, 2300 PVU2, 2310 BAMHI, 

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSAB1, 2441 AVR2, 2448 SSE8387I, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASE1, 2497 APAI,
ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAG1 XMA3, 

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln 
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3,

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM1105I,

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSAB1, 3487 BSPE1,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3,

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI,

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 

4229 DRD1, 4236 ALWN1,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI,

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI,

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 
TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLM1, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2,

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 
ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 



LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI,

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG

5548 ALWN1, 5558 ESP1, 5564 SMAI XMAI, 5568 KPNI,

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 
ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

5650 APAI, 5696 CLAI, 

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT 
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

5724 HGIE2, 5750 KAS1 NARI, 5756 ECON1, 

GlyGlyAlaAlaArgAlaLeuAlaHisGlyValArgValLeuGluAspGlyValAsnTyr 
5762 GGAGGCGCTGCCAGGGCCCTGGCGCATGGCGTCCGGGTTCTGGAAGACGGCGTGAACTAT 
CCTCCGCGACGGTCCCGGGACCGCGTACCGCAGGCCCAAGACCTTCTGCCGCACTTGATA 

5772 BSTXI, 5775 APAI, 

AlaThrGlyAsnLeuProGlyCysSerOC AM 
5822 GCAACAGGGAACCTTCCTGGTTGCTCTTAATAGTCGAC 
CGTTGTCCCTTGGAAGGACCAACGAGAATTATCAGCTG 

5854 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI, 

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI, 

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA 
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWN1, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT 
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG 
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRD1, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPH1, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1,

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI,

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI, 

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1, 



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1, 

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC 
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I, 

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI, 

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA

1643 BSTE2, 1677 ALWN1 PVU2, 

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KAS1 NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPH1, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC 
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG 
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1, 

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG

2164 MST2, 2220 ECON1,

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT 
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA 

2285 ESP1, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSAB1, 2441 AVR2, 2448 SSE83871, 2449 PSTI,

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASE1, 2497 APAI, 

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC 
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC 
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG

2850 EAG1 XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys 
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC 
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI,

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3,

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC

3484 BAMHI, 3485 BSAB1, 3487 BSPE1,

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG

3589 DRA3, 3600 SAC2,

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI,

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC

4229 DRD1, 4236 ALWN1,

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI,

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I,

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI,

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG

4806 PFLM1, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI,

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI,

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI, 

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC

5449 APAI,

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2,

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWN1, 5558 ESP1, 5564 SMAI XMAI, 5568 KPNI, 

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

5650 APAI, 5696 CLAI, 

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValOC AM
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCTAATAGTCGAC
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGATTATCAGCTG 

5724 HGIE2, 5755 SALI,

MetAlaAlaTyrAlaAlaGlnGlyTyrLysValLeuValLeuAsn
2 AGCTTACAAAACAAAATGGCTGCATATGCAGCTCAGGGCTATAAGGTGCTAGTACTCAAC 
TCGAATGTTTTGTTTTACCGACGTATACGTCGAGTCCCGATATTCCACGATCATGAGTTG 

1 HIND3, 24 NDEI, 52 SCAI,

ProSerValAlaAlaThrLeuGlyPheGlyAlaTyrMetSerLysAlaHisGlyIleAsp
62 CCCTCTGTTGCTGCAACACTGGGCTTTGGTGCTTACATGTCCAAGGCTCATGGGATCGAT 
GGGAGACAACGACGTTGTGACCCGAAACCACGAATGTACAGGTTCCGAGTACCCTAGCTA 

116 CLAI,

ProAsnIleArgThrGlyValArgThrIleThrThrGlySerProIleThrTyrSerThr
122 CCTAACATCAGGACCGGGGTGAGAACAATTACCACTGGCAGCCCCATCACGTACTCCACC 
GGATTGTAGTCCTGGCCCCACTCTTGTTAATGGTGACCGTCGGGGTAGTGCATGAGGTGG 

TyrGlyLysPheLeuAlaAspGlyGlyCysSerGlyGlyAlaTyrAspIleIleIleCys
182 TACGGCAAGTTCCTTGCCGACGGCGGGTGCTCGGGGGGCGCTTATGACATAATAATTTGT 
ATGCCGTTCAAGGAACGGCTGCCGCCCACGAGCCCCCCGCGAATACTGTATTATTAAACA 

AspGluCysHisSerThrAspAlaThrSerIleLeuGlyIleGlyThrValLeuAspGln
242 GACGAGTGCCACTCCACGGATGCCACATCCATCTTGGGCATTGGCACTGTCCTTGACCAA
CTGCTCACGGTGAGGTGCCTACGGTGTAGGTAGAACCCGTAACCGTGACAGGAACTGGTT 

AlaGluThrAlaGlyAlaArgLeuValValLeuAlaThrAlaThrProProGlySerVal 
302 GCAGAGACTGCGGGGGCGAGACTGGTTGTGCTCGCCACCGCCACCCCTCCGGGCTCCGTC 
CGTCTCTGACGCCCCCGCTCTGACCAACACGAGCGGTGGCGGTGGGGAGGCCCGAGGCAG 

303 ALWN1, 

ThrValProHisProAsnIleGluGluValAlaLeuSerThrThrGlyGluIleProPhe
362 ACTGTGCCCCATCCCAACATCGAGGAGGTTGCTCTGTCCACCACCGGAGAGATCCCTTTT 
TGACACGGGGTAGGGTTGTAGCTCCTCCAACGAGACAGGTGGTGGCCTCTCTAGGGAAAA 

TyrGlyLysAlaIleProLeuGluValIleLysGlyGlyArgHisLeuIlePheCysHis
422 TACGGCAAGGCTATCCCCCTCGAAGTAATCAAGGGGGGGAGACATCTCATCTTCTGTCAT
ATGCCGTTCCGATAGGGGGAGCTTCATTAGTTCCCCCCCTCTGTAGAGTAGAAGACAGTA

SerLysLysLysCysAspGluLeuAlaAlaLysLeuValAlaLeuGlyIleAsnAlaVal
482 TCAAAGAAGAAGTGCGACGAACTCGCCGCAAAGCTGGTCGCATTGGGCATCAATGCCGTG 
AGTTTCTTCTTCACGCTGCTTGAGCGGCGTTTCGACCAGCGTAACCCGTAGTTACGGCAC 

AlaTyrTyrArgGlyLeuAspValSerValIleProThrSerGlyAspValValValVal
542 GCCTACTACCGCGGTCTTGACGTGTCCGTCATCCCGACCAGCGGCGATGTTGTCGTCGTG 
CGGATGATGGCGCCAGAACTGCACAGGCAGTAGGGCTGGTCGCCGCTACAACAGCAGCAC 

550 SAC2, 560 DRD1, 

AlaThrAspAlaLeuMetThrGlyTyrThrGlyAspPheAspSerValIleAspCysAsn
602 GCAACCGATGCCCTCATGACCGGCTATACCGGCGACTTCGACTCGGTGATAGACTGCAAT 
CGTTGGCTACGGGAGTACTGGCCGATATGGCCGCTGAAGCTGAGCCACTATCTGACGTTA 

615 BSPH1, 

ThrCysValThrGlnThrValAspPheSerLeuAspProThrPheThrIleGluThrIle
662 ACGTGTGTCACCCAGACAGTCGATTTCAGCCTTGACCCTACCTTCACCATTGAGACAATC 
TGCACACAGTGGGTCTGTCAGCTAAAGTCGGAACTGGGATGGAAGTGGTAACTCTGTTAG 

ThrLeuProGlnAspAlaValSerArgThrGlnArgArgGlyArgThrGlyArgGlyLys 
722 ACGCTCCCCCAAGATGCTGTCTCCCGCACTCAACGTCGGGGCAGGACTGGCAGGGGGAAG 
TGCGAGGGGGTTCTACGACAGAGGGCGTGAGTTGCAGCCCCGTCCTGACCGTCCCCCTTC 

ProGlyIleTyrArgPheValAlaProGlyGluArgProSerGlyMetPheAspSerSer
782 CCAGGCATCTACAGATTTGTGGCACCGGGGGAGCGCCCCTCCGGCATGTTCGACTCGTCC 
GGTCCGTAGATGTCTAAACACCGTGGCCCCCTCGCGGGGAGGCCGTACAAGCTGAGCAGG 

816 BGLI, 833 DRD1, 

ValLeuCysGluCysTyrAspAlaGlyCysAlaTrpTyrGluLeuThrProAlaGluThr 
842 GTCCTCTGTGAGTGCTATGACGCAGGCTGTGCTTGGTATGAGCTCACGCCCGCCGAGACT
CAGGAGACACTCACGATACTGCGTCCGACACGAACCATACTCGAGTGCGGGCGGCTCTGA 

881 SACI, 

ThrValArgLeuArgAlaTyrMetAsnThrProGlyLeuProValCysGlnAspHisLeu 
902 ACAGTTAGGCTACGAGCGTACATGAACACCCCGGGGCTTCCCGTGTGCCAGGACCATCTT 
TGTCAATCCGATGCTCGCATGTACTTGTGGGGCCCCGAAGGGCACACGGTCCTGGTAGAA 

931 SMAI XMAI, 

GluPheTrpGluGlyValPheThrGlyLeuThrHisIleAspAlaHisPheLeuSerGln 
962 GAATTTTGGGAGGGCGTCTTTACAGGCCTCACTCATATAGATGCCCACTTTCTATCCCAG 
CTTAAAACCCTCCCGCAGAAATGTCCGGAGTGAGTATATCTACGGGTGAAAGATAGGGTC 

985 STUI, 

ThrLysGlnSerGlyGluAsnLeuProTyrLeuValAlaTyrGlnAlaThrValCysAla 
1022 ACAAAGCAGAGTGGGGAGAACCTTCCTTACCTGGTAGCGTACCAAGCCACCGTGTGCGCT 
TGTTTCGTCTCACCCCTCTTGGAAGGAATGGACCATCGCATGGTTCGGTGGCACACGCGA 

1069 DRA3, 

ArgAlaGlnAlaProProProSerTrpAspGlnMetTrpLysCysLeuIleArgLeuLys 
1082 AGGGCTCAAGCCCCTCCCCCATCGTGGGACCAGATGTGGAAGTGTTTGATTCGCCTCAAG
TCCCGAGTTCGGGGAGGGGGTAGCACCCTGGTCTACACCTTCACAAACTAAGCGGAGTTC

ProThrLeuHisGlyProThrProLeuLeuTyrArgLeuGlyAlaValGlnAsnGluIle
1142 CCCACCCTCCATGGGCCAACACCCCTGCTATACAGACTGGGCGCTGTTCAGAATGAAATC 
GGGTGGGAGGTACCCGGTTGTGGGGACGATATGTCTGACCCGCGACAAGTCTTACTTTAG 

1150 NCOI,

ThrLeuThrHisProValThrLysTyrIleMetThrCysMetSerAlaAspLeuGluVal
1202 ACCCTGACGCACCCAGTCACCAAATACATCATGACATGCATGTCGGCCGACCTGGAGGTC 
TGGGACTGCGTGGGTCAGTGGTTTATGTAGTACTGTACGTACAGCCGGCTGGACCTCCAG 

1230 BSPH1, 1234 DRD1, 1237 AVA3, 1245 EAG1 XMA3, 1250 DRD1,



ValThrSerThrTrpValLeuValGlyGlyValLeuAlaAlaLeuAlaAlaTyrCysLeu 
1262 GTCACGAGCACCTGGGTGCTCGTTGGCGGCGTCCTGGCTGCTTTGGCCGCGTATTGCCTG 
CAGTGCTCGTGGACCCACGAGCAACCGCCGCAGGACCGACGAAACCGGCGCATAACGGAC 

SerThrGlyCysValValIleValGlyArgValValLeuSerGlyLysProAlaIleIle
1322 TCAACAGGCTGCGTGGTCATAGTGGGCAGGGTCGTCTTGTCCGGGAAGCCGGCAATCATA 
AGTTGTCCGACGCACCAGTATCACCCGTCCCAGCAGAACAGGCCCTTCGGCCGTTAGTAT 

1369 NAEI, 

ProAspArgGluValLeuTyrArgGluPheAspGluMetGluGluCysSerGlnHisLeu 
1382 CCTGACAGGGAAGTCCTCTACCGAGAGTTCGATGAGATGGAAGAGTGCTCTCAGCACTTA 
GGACTGTCCCTTCAGGAGATGGCTCTCAAGCTACTCTACCTTCTCACGAGAGTCGTGAAT 

1385 DRD1,

ProTyrIleGluGlnGlyMetMetLeuAlaGluGlnPheLysGlnLysAlaLeuGlyLeu
1442 CCGTACATCGAGCAAGGGATGATGCTCGCCGAGCAGTTCAAGCAGAAGGCCCTCGGCCTC
GGCATGTAGCTCGTTCCCTACTACGAGCGGCTCGTCAAGTTCGTCTTCCGGGAGCCGGAG 

LeuGlnThrAlaSerArgGlnAlaGluValIleAlaProAlaValGlnThrAsnTrpGln
1502 CTGCAGACCGCGTCCCGTCAGGCAGAGGTTATCGCCCCTGCTGTCCAGACCAACTGGCAA 
GACGTCTGGCGCAGGGCAGTCCGTCTCCAATAGCGGGGACGACAGGTCTGGTTGACCGTT 

1502 PSTI, 1507 TTH3I,

LysLeuGluThrPheTrpAlaLysHisMetTrpAsnPheIleSerGlyIleGlnTyrLeu
1562 AAACTCGAGACCTTCTGGGCGAAGCATATGTGGAACTTCATCAGTGGGATACAATACTTG 
TTTGAGCTCTGGAAGACCCGCTTCGTATACACCTTGAAGTAGTCACCCTATGTTATGAAC 

1565 XHOI, 1586 NDEI,

AlaGlyLeuSerThrLeuProGlyAsnProAlaIleAlaSerLeuMetAlaPheThrAla
1622 GCGGGCTTGTCAACGCTGCCTGGTAACCCCGCCATTGCTTCATTGATGGCTTTTACAGCT 
CGCCCGAACAGTTGCGACGGACCATTGGGGCGGTAACGAAGTAACTACCGAAAATGTCGA 

1643 BSTE2, 1677 ALWN1 PVU2, 

AlaValThrSerProLeuThrThrSerGlnThrLeuLeuPheAsnIleLeuGlyGlyTrp
1682 GCTGTCACCAGCCCACTAACCACTAGCCAAACCCTCCTCTTCAACATATTGGGGGGGTGG 
CGACAGTGGTCGGGTGATTGGTGATCGGTTTGGGAGGAGAAGTTGTATAACCCCCCCACC

ValAlaAlaGlnLeuAlaAlaProGlyAlaAlaThrAlaPheValGlyAlaGlyLeuAla
1742 GTGGCTGCCCAGCTCGCCGCCCCCGGTGCCGCTACTGCCTTTGTGGGCGCTGGCTTAGCT 
CACCGACGGGTCGAGCGGCGGGGGCCACGGCGATGACGGAAACACCCGCGACCGAATCGA 

1794 ESP1, 

GlyAlaAlaIleGlySerValGlyLeuGlyLysValLeuIleAspIleLeuAlaGlyTyr
1802 GGCGCCGCCATCGGCAGTGTTGGACTGGGGAAGGTCCTCATAGACATCCTTGCAGGGTAT 
CCGCGGCGGTAGCCGTCACAACCTGACCCCTTCCAGGAGTATCTGTAGGAACGTCCCATA 

1802 KAS1 NARI, 

GlyAlaGlyValAlaGlyAlaLeuValAlaPheLysIleMetSerGlyGluValProSer 
1862 GGCGCGGGCGTGGCGGGAGCTCTTGTGGCATTCAAGATCATGAGCGGTGAGGTCCCCTCC
CCGCGCCCGCACCGCCCTCGAGAACACCGTAAGTTCTAGTACTCGCCACTCCAGGGGAGG 

1878 SACI, 1899 BSPH1, 

ThrGluAspLeuValAsnLeuLeuProAlaIleLeuSerProGlyAlaLeuValValGly
1922 ACGGAGGACCTGGTCAATCTACTGCCCGCCATCCTCTCGCCCGGAGCCCTCGTAGTCGGC
TGCCTCCTGGACCAGTTAGATGACGGGCGGTAGGAGAGCGGGCCTCGGGAGCATCAGCCG 

1928 TTH3I, 

ValValCysAlaAlaIleLeuArgArgHisValGlyProGlyGluGlyAlaValGlnTrp
1982 GTGGTCTGTGCAGCAATACTGCGCCGGCACGTTGGCCCGGGCGAGGGGGCAGTGCAGTGG 
CACCAGACACGTCGTTATGACGCGGCCGTGCAACCGGGCCCGCTCCCCCGTCACGTCACC 

2004 NAEI, 2017 SMAI XMAI, 

MetAsnArgLeuIleAlaPheAlaSerArgGlyAsnHisValSerProThrHisTyrVal 
2042 ATGAACCGGCTGATAGCCTTCGCCTCCCGGGGGAACCATGTTTCCCCCACGCACTACGTG 
TACTTGGCCGACTATCGGAAGCGGAGGGCCCCCTTGGTACAAAGGGGGTGCGTGATGCAC 

2067 SMAI XMAI, 2093 DRA3, 

ProGluSerAspAlaAlaAlaArgValThrAlaIleLeuSerSerLeuThrValThrGln
2102 CCGGAGAGCGATGCAGCTGCCCGCGTCACTGCCATACTCAGCAGCCTCACTGTAACCCAG 
GGCCTCTCGCTACGTCGACGGGCGCAGTGACGGTATGAGTCGTCGGAGTGACATTGGGTC 

2115 PVU2, 2159 ALWN1,

LeuLeuArgArgLeuHisGlnTrpIleSerSerGluCysThrThrProCysSerGlySer 
2162 CTCCTGAGGCGACTGCACCAGTGGATAAGCTCGGAGTGTACCACTCCATGCTCCGGTTCC 
GAGGACTCCGCTGACGTGGTCACCTATTCGAGCCTCACATGGTGAGGTACGAGGCCAAGG 

2164 MST2, 2220 ECON1, 

TrpLeuArgAspIleTrpAspTrpIleCysGluValLeuSerAspPheLysThrTrpLeu 
2222 TGGCTAAGGGACATCTGGGACTGGATATGCGAGGTGTTGAGCGACTTTAAGACCTGGCTA 
ACCGATTCCCTGTAGACCCTGACCTATACGCTCCACAACTCGCTGAAATTCTGGACCGAT 

LysAlaLysLeuMetProGlnLeuProGlyIleProPheValSerCysGlnArgGlyTyr
2282 AAAGCTAAGCTCATGCCACAGCTGCCTGGGATCCCCTTTGTGTCCTGCCAGCGCGGGTAT
TTTCGATTCGAGTACGGTGTCGACGGACCCTAGGGGAAACACAGGACGGTCGCGCCCATA

2285 ESP1, 2300 PVU2, 2310 BAMHI,

LysGlyValTrpArgGlyAspGlyIleMetHisThrArgCysHisCysGlyAlaGluIle
2342 AAGGGGGTCTGGCGAGGGGACGGCATCATGCACACTCGCTGCCACTGTGGAGCTGAGATC 
TTCCCCCAGACCGCTCCCCTGCCGTAGTACGTGTGAGCGACGGTGACACCTCGACTCTAG 

ThrGlyHisValLysAsnGlyThrMetArgIleValGlyProArgThrCysArgAsnMet
2402 ACTGGACATGTCAAAAACGGGACGATGAGGATCGTCGGTCCTAGGACCTGCAGGAACATG 
TGACCTGTACAGTTTTTGCCCTGCTACTCCTAGCAGCCAGGATCCTGGACGTCCTTGTAC 

2425 BSAB1, 2441 AVR2, 2448 SSE83871, 2449 PSTI, 

TrpSerGlyThrPheProIleAsnAlaTyrThrThrGlyProCysThrProLeuProAla 
2462 TGGAGTGGGACCTTCCCCATTAATGCCTACACCACGGGCCCCTGTACCCCCCTTCCTGCG
ACCTCACCCTGGAAGGGGTAATTACGGATGTGGTGCCCGGGGACATGGGGGGAAGGACGC 

2480 ASE1, 2497 APAI,

ProAsnTyrThrPheAlaLeuTrpArgValSerAlaGluGluTyrValGluIleArgGln 
2522 CCGAACTACACGTTCGCGCTATGGAGGGTGTCTGCAGAGGAATACGTGGAGATAAGGCAG 
GGCTTGATGTGCAAGCGCGATACCTCCCACAGACGTCTCCTTATGCACCTCTATTCCGTC 

2553 PSTI, 

ValGlyAspPheHisTyrValThrGlyMetThrThrAspAsnLeuLysCysProCysGln 
2582 GTGGGGGACTTCCACTACGTGACGGGTATGACTACTGACAATCTTAAATGCCCGTGCCAG 
CACCCCCTGAAGGTGATGCACTGCCCATACTGATGACTGTTAGAATTTACGGGCACGGTC 

2594 DRA3, 

ValProSerProGluPhePheThrGluLeuAspGlyValArgLeuHisArgPheAlaPro 
2642 GTCCCATCGCCCGAATTTTTCACAGAATTGGACGGGGTGCGCCTACATAGGTTTGCGCCC
CAGGGTAGCGGGCTTAAAAAGTGTCTTAACCTGCCCCACGCGGATGTATCCAAACGCGGG 

ProCysLysProLeuLeuArgGluGluValSerPheArgValGlyLeuHisGluTyrPro 
2702 CCCTGCAAGCCCTTGCTGCGGGAGGAGGTATCATTCAGAGTAGGACTCCACGAATACCCG 
GGGACGTTCGGGAACGACGCCCTCCTCCATAGTAAGTCTCATCCTGAGGTGCTTATGGGC 

2757 HGIE2, 

ValGlySerGlnLeuProCysGluProGluProAspValAlaValLeuThrSerMetLeu 
2762 GTAGGGTCGCAATTACCTTGCGAGCCCGAACCGGACGTGGCCGTGTTGACGTCCATGCTC
CATCCCAGCGTTAATGGAACGCTCGGGCTTGGCCTGCACCGGCACAACTGCAGGTACGAG 

2809 AAT2, 

ThrAspProSerHisIleThrAlaGluAlaAlaGlyArgArgLeuAlaArgGlySerPro 
2822 ACTGATCCCTCCCATATAACAGCAGAGGCGGCCGGGCGAAGGTTGGCGAGGGGATCACCC 
TGACTAGGGAGGGTATATTGTCGTCTCCGCCGGCCCGCTTCCAACCGCTCCCCTAGTGGG 

2850 EAG1 XMA3,

ProSerValAlaSerSerSerAlaSerGlnLeuSerAlaProSerLeuLysAlaThrCys
2882 CCCTCTGTGGCCAGCTCCTCGGCTAGCCAGCTATCCGCTCCATCTCTCAAGGCAACTTGC
GGGAGACACCGGTCGAGGAGCCGATCGGTCGATAGGCGAGGTAGAGAGTTCCGTTGAACG 

2889 BALI, 2903 NHEI,

ThrAlaAsnHisAspSerProAspAlaGluLeuIleGluAlaAsnLeuLeuTrpArgGln
2942 ACCGCTAACCATGACTCCCCTGATGCTGAGCTCATAGAGGCCAACCTCCTATGGAGGCAG 
TGGCGATTGGTACTGAGGGGACTACGACTCGAGTATCTCCGGTTGGAGGATACCTCCGTC 

2966 ESP1, 2969 SACI, 

GluMetGlyGlyAsnIleThrArgValGluSerGluAsnLysValValIleLeuAspSer
3002 GAGATGGGCGGCAACATCACCAGGGTTGAGTCAGAAAACAAAGTGGTGATTCTGGACTCC 
CTCTACCCGCCGTTGTAGTGGTCCCAACTCAGTCTTTTGTTTCACCACTAAGACCTGAGG 

PheAspProLeuValAlaGluGluAspGluArgGluIleSerValProAlaGluIleLeu 
3062 TTCGATCCGCTTGTGGCGGAGGAGGACGAGCGGGAGATCTCCGTACCCGCAGAAATCCTG 
AAGCTAGGCGAACACCGCCTCCTCCTGCTCGCCCTCTAGAGGCATGGGCGTCTTTAGGAC 

3096 BGL2, 

ArgLysSerArgArgPheAlaGlnAlaLeuProValTrpAlaArgProAspTyrAsnPro 
3122 CGGAAGTCTCGGAGATTCGCCCAGGCCCTGCCCGTTTGGGCGCGGCCGGACTATAACCCC 
GCCTTCAGAGCCTCTAAGCGGGTCCGGGACGGGCAAACCCGCGCCGGCCTGATATTGGGG 

3143 ALWN1, 3164 EAG1 XMA3, 

ProLeuValGluThrTrpLysLysProAspTyrGluProProValValHisGlyCysPro 
3182 CCGCTAGTGGAGACGTGGAAAAAGCCCGACTACGAACCACCTGTGGTCCATGGCTGCCCG 
GGCGATCACCTCTGCACCTTTTTCGGGCTGATGCTTGGTGGACACCAGGTACCGACGGGC 

3217 HGIE2, 3229 NCOI, 

LeuProProProLysSerProProValProProProArgLysLysArgThrValValLeu 
3242 CTTCCACCTCCAAAGTCCCCTCCTGTGCCTCCGCCTCGGAAGAAGCGGACGGTGGTCCTC 
GAAGGTGGAGGTTTCAGGGGAGGACACGGAGGCGGAGCCTTCTTCGCCTGCCACCAGGAG 

ThrGluSerThrLeuSerThrAlaLeuAlaGluLeuAlaThrArgSerPheGlySerSer 
3302 ACTGAATCAACCCTATCTACTGCCTTGGCCGAGCTCGCCACCAGAAGCTTTGGCAGCTCC 
TGACTTAGTTGGGATAGATGACGGAACCGGCTCGAGCGGTGGTCTTCGAAACCGTCGAGG 

3332 SACI, 3346 HIND3, 

SerThrSerGlyIleThrGlyAspAsnThrThrThrSerSerGluProAlaProSerGly 
3362 TCAACTTCCGGCATTACGGGCGACAATACGACAACATCCTCTGAGCCCGCCCCTTCTGGC 
AGTTGAAGGCCGTAATGCCCGCTGTTATGCTGTTGTAGGAGACTCGGGCGGGGAAGACCG 

CysProProAspSerAspAlaGluSerTyrSerSerMetProProLeuGluGlyGluPro 
3422 TGCCCCCCCGACTCCGACGCTGAGTCCTATTCCTCCATGCCCCCCCTGGAGGGGGAGCCT 
ACGGGGGGGCTGAGGCTGCGACTCAGGATAAGGAGGTACGGGGGGGACCTCCCCCTCGGA 

3437 EAM11051, 

GlyAspProAspLeuSerAspGlySerTrpSerThrValSerSerGluAlaAsnAlaGlu 
3482 GGGGATCCGGATCTTAGCGACGGGTCATGGTCAACGGTCAGTAGTGAGGCCAACGCGGAG 
CCCCTAGGCCTAGAATCGCTGCCCAGTACCAGTTGCCAGTCATCACTCCGGTTGCGCCTC 

3484 BAMHI, 3485 BSAB1, 3487 BSPE1, 

AspValValCysCysSerMetSerTyrSerTrpThrGlyAlaLeuValThrProCysAla 
3542 GATGTCGTGTGCTGCTCAATGTCTTACTCTTGGACAGGCGCACTCGTCACCCCGTGCGCC 
CTACAGCACACGACGAGTTACAGAATGAGAACCTGTCCGCGTGAGCAGTGGGGCACGCGG 
3589 DRA3, 3600 SAC2, 

AlaGluGluGlnLysLeuProIleAsnAlaLeuSerAsnSerLeuLeuArgHisHisAsn 
3602 GCGGAAGAACAGAAACTGCCCATCAATGCACTAAGCAACTCGTTGCTACGTCACCACAAT 
CGCCTTCTTGTCTTTGACGGGTAGTTACGTGATTCGTTGAGCAACGATGCAGTGGTGTTA 

3611 ALWN1, 3655 PFLM1, 

LeuValTyrSerThrThrSerArgSerAlaCysGlnArgGlnLysLysValThrPheAsp 
3662 TTGGTGTATTCCACCACCTCACGCAGTGCTTGCCAAAGGCAGAAGAAAGTCACATTTGAC 
AACCACATAAGGTGGTGGAGTGCGTCACGAACGGTTTCCGTCTTCTTTCAGTGTAAACTG 

3681 DRA3, 

ArgLeuGlnValLeuAspSerHisTyrGlnAspValLeuLysGluValLysAlaAlaAla 
3722 AGACTGCAAGTTCTGGACAGCCATTACCAGGACGTACTCAAGGAGGTTAAAGCAGCGGCG 
TCTGACGTTCAAGACCTGTCGGTAATGGTCCTGCATGAGTTCCTCCAATTTCGTCGCCGC 

SerLysValLysAlaAsnLeuLeuSerValGluGluAlaCysSerLeuThrProProHis 
3782 TCAAAAGTGAAGGCTAACTTGCTATCCGTAGAGGAAGCTTGCAGCCTGACGCCCCCACAC 
AGTTTTCACTTCCGATTGAACGATAGGCATCTCCTTCGAACGTCGGACTGCGGGGGTGTG 

3816 HIND3, 

SerAlaLysSerLysPheGlyTyrGlyAlaLysAspValArgCysHisAlaArgLysAla 
3842 TCAGCCAAATCCAAGTTTGGTTATGGGGCAAAAGACGTCCGTTGCCATGCCAGAAAGGCC 
AGTCGGTTTAGGTTCAAACCAATACCCCGTTTTCTGCAGGCAACGGTACGGTCTTTCCGG 

3875 AAT2, 3890 BGLI, 

ValThrHisIleAsnSerValTrpLysAspLeuLeuGluAspAsnValThrProIleAsp 
3902 GTAACCCACATCAACTCCGTGTGGAAAGACCTTCTGGAAGACAATGTAACACCAATAGAC 
CATTGGGTGTAGTTGAGGCACACCTTTCTGGAAGACCTTCTGTTACATTGTGGTTATCTG 

ThrThrIleMetAlaLysAsnGluValPheCysValGlnProGluLysGlyGlyArgLys 
3962 ACTACCATCATGGCTAAGAACGAGGTTTTCTGCGTTCAGCCTGAGAAGGGGGGTCGTAAG 
TGATGGTAGTACCGATTCTTGCTCCAAAAGACGCAAGTCGGACTCTTCCCCCCAGCATTC 

ProAlaArgLeuIleValPheProAspLeuGlyValArgValCysGluLysMetAlaLeu 
4022 CCAGCTCGTCTCATCGTGTTCCCCGATCTGGGCGTGCGCGTGTGCGAAAAGATGGCTTTG 
GGTCGAGCAGAGTAGCACAAGGGGCTAGACCCGCACGCGCACACGCTTTTCTACCGAAAC 

TyrAspValValThrLysLeuProLeuAlaValMetGlySerSerTyrGlyPheGlnTyr 
4082 TACGACGTGGTTACAAAGCTCCCCTTGGCCGTGATGGGAAGCTCCTACGGATTCCAATAC 
ATGCTGCACCAATGTTTCGAGGGGAACCGGCACTACCCTTCGAGGATGCCTAAGGTTATG 

SerProGlyGlnArgValGluPheLeuValGlnAlaTrpLysSerLysLysThrProMet 
4142 TCACCAGGACAGCGGGTTGAATTCCTCGTGCAAGCGTGGAAGTCCAAGAAAACCCCAATG 
AGTGGTCCTGTCGCCCAACTTAAGGAGCACGTTCGCACCTTCAGGTTCTTTTGGGGTTAC 

4160 ECORI, 

GlyPheSerTyrAspThrArgCysPheAspSerThrValThrGluSerAspIleArgThr 
4202 GGGTTCTCGTATGATACCCGCTGCTTTGACTCCACAGTCACTGAGAGCGACATCCGTACG 
CCCAAGAGCATACTATGGGCGACGAAACTGAGGTGTCAGTGACTCTCGCTGTAGGCATGC 
4229 DRD1, 4236 ALWN1, 

GluGluAlaIleTyrGlnCysCysAspLeuAspProGlnAlaArgValAlaIleLysSer 
4262 GAGGAGGCAATCTACCAATGTTGTGACCTCGACCCCCAAGCCCGCGTGGCCATCAAGTCC 
CTCCTCCGTTAGATGGTTACAACACTGGAGCTGGGGGTTCGGGCGCACCGGTAGTTCAGG 

4301 BGLI, 4308 BALI, 

LeuThrGluArgLeuTyrValGlyGlyProLeuThrAsnSerArgGlyGluAsnCysGly 
4322 CTCACCGAGAGGCTTTATGTTGGGGGCCCTCTTACCAATTCAAGGGGGGAGAACTGCGGC 
GAGTGGCTCTCCGAAATACAACCCCCGGGAGAATGGTTAAGTTCCCCCCTCTTGACGCCG 

4345 APAI, 

TyrArgArgCysArgAlaSerGlyValLeuThrThrSerCysGlyAsnThrLeuThrCys 
4382 TATCGCAGGTGCCGCGCGAGCGGCGTACTGACAACTAGCTGTGGTAACACCCTCACTTGC 
ATAGCGTCCACGGCGCGCTCGCCGCATGACTGTTGATCGACACCATTGTGGGAGTGAACG 

TyrIleLysAlaArgAlaAlaCysArgAlaAlaGlyLeuGlnAspCysThrMetLeuVal 
4442 TACATCAAGGCCCGGGCAGCCTGTCGAGCCGCAGGGCTCCAGGACTGCACCATGCTCGTG 
ATGTAGTTCCGGGCCCGTCGGACAGCTCGGCGTCCCGAGGTCCTGACGTGGTACGAGCAC 

4452 SMAI XMAI, 

CysGlyAspAspLeuValValIleCysGluSerAlaGlyValGlnGluAspAlaAlaSer 
4502 TGTGGCGACGACTTAGTCGTTATCTGTGAAAGCGCGGGGGTCCAGGAGGACGCGGCGAGC 
ACACCGCTGCTGAATCAGCAATAGACACTTTCGCGCCCCCAGGTCCTCCTGCGCCGCTCG 

4508 DRD1, 4511 TTH3I, 

LeuArgAlaPheThrGluAlaMetThrArgTyrSerAlaProProGlyAspProProGln 
4562 CTGAGAGCCTTCACGGAGGCTATGACCAGGTACTCCGCCCCCCCTGGGGACCCCCCACAA 
GACTCTCGGAAGTGCCTCCGATACTGGTCCATGAGGCGGGGGGGACCCCTGGGGGGTGTT 

ProGluTyrAspLeuGluLeuIleThrSerCysSerSerAsnValSerValAlaHisAsp 
4622 CCAGAATACGACTTGGAGCTCATAACATCATGCTCCTCCAACGTGTCAGTCGCCCACGAC 
GGTCTTATGCTGAACCTCGAGTATTGTAGTACGAGGAGGTTGCACAGTCAGCGGGTGCTG 

4637 SACI, 

GlyAlaGlyLysArgValTyrTyrLeuThrArgAspProThrThrProLeuAlaArgAla 
4682 GGCGCTGGAAAGAGGGTCTACTACCTCACCCGTGACCCTACAACCCCCCTCGCGAGAGCT 
CCGCGACCTTTCTCCCAGATGATGGAGTGGGCACTGGGATGTTGGGGGGAGCGCTCTCGA 

4731 NRUI, 

AlaTrpGluThrAlaArgHisThrProValAsnSerTrpLeuGlyAsnIleIleMetPhe 
4742 GCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCTAGGCAACATAATCATGTTT 
CGCACCCTCTGTCGTTCTGTGTGAGGTCAGTTAAGGACCGATCCGTTGTATTAGTACAAA 

AlaProThrLeuTrpAlaArgMetIleLeuMetThrHisPhePheSerValLeuIleAla 
4802 GCCCCCACACTGTGGGCGAGGATGATACTGATGACCCATTTCTTTAGCGTCCTTATAGCC 
CGGGGGTGTGACACCCGCTCCTACTATGACTACTGGGTAAAGAAATCGCAGGAATATCGG 

4806 PFLM1, 4807 DRA3, 

ArgAspGlnLeuGluGlnAlaLeuAspCysGluIleTyrGlyAlaCysTyrSerIleGlu 
4862 AGGGACCAGCTTGAACAGGCCCTCGATTGCGAGATCTACGGGGCCTGCTACTCCATAGAA 
TCCCTGGTCGAACTTGTCCGGGAGCTAACGCTCTAGATGCCCCGGACGATGAGGTATCTT 

4893 BGL2, 

ProLeuAspLeuProProIleIleGlnArgLeuHisGlyLeuSerAlaPheSerLeuHis 
4922 CCACTGGATCTACCTCCAATCATTCAAAGACTCCATGGCCTCAGCGCATTTTCACTCCAC 
GGTGACCTAGATGGAGGTTAGTAAGTTTCTGAGGTACCGGAGTCGCGTAAAAGTGAGGTG 

4954 NCOI, 

SerTyrSerProGlyGluIleAsnArgValAlaAlaCysLeuArgLysLeuGlyValPro 
4982 AGTTACTCTCCAGGTGAAATCAATAGGGTGGCCGCATGCCTCAGAAAACTTGGGGTACCG 
TCAATGAGAGGTCCACTTTAGTTATCCCACCGGCGTACGGAGTCTTTTGAACCCCATGGC 

5015 SPHI, 5035 KPNI, 

ProLeuArgAlaTrpArgHisArgAlaArgSerValArgAlaArgLeuLeuAlaArgGly 
5042 CCCTTGCGAGCTTGGAGACACCGGGCCCGGAGCGTCCGCGCTAGGCTTCTGGCCAGAGGA 
GGGAACGCTCGAACCTCTGTGGCCCGGGCCTCGCAGGCGCGATCCGAAGACCGGTCTCCT 

5064 APAI, 5091 BALI, 

GlyArgAlaAlaIleCysGlyLysTyrLeuPheAsnTrpAlaValArgThrLysLeuLys 
5102 GGCAGGGCTGCCATATGTGGCAAGTACCTCTTCAACTGGGCAGTAAGAACAAAGCTCAAA 
CCGTCCCGACGGTATACACCGTTCATGGAGAAGTTGACCCGTCATTCTTGTTTCGAGTTT 

5113 NDEI, 

LeuThrProIleAlaAlaAlaGlyGlnLeuAspLeuSerGlyTrpPheThrAlaGlyTyr 
5162 CTCACTCCAATAGCGGCCGCTGGCCAGCTGGACTTGTCCGGCTGGTTCACGGCTGGCTAC 
GAGTGAGGTTATCGCCGGCGACCGGTCGACCTGAACAGGCCGACCAAGTGCCGACCGATG 

5174 NOTI, 5175 EAG1 XMA3, 5182 BALI, 5186 PVU2, 

SerGlyGlyAspIleTyrHisSerValSerHisAlaArgProArgTrpIleTrpPheCys 
5222 AGCGGGGGAGACATTTATCACAGCGTGTCTCATGCCCGGCCCCGCTGGATCTGGTTTTGC 
TCGCCCCCTCTGTAAATAGTGTCGCACAGAGTACGGGCCGGGGCGACCTAGACCAAAACG 

5240 DRA3, 

LeuLeuLeuLeuAlaAlaGlyValGlyIleTyrLeuLeuProAsnArgMetSerThrAsn 
5282 CTACTCCTGCTTGCTGCAGGGGTAGGCATCTACCTCCTCCCCAACCGAATGAGCACGAAT 
GATGAGGACGAACGACGTCCCCATCCGTAGATGGAGGAGGGGTTGGCTTACTCGTGCTTA 

5295 PSTI, 

ProLysProGlnArgLysThrLysArgAsnThrAsnArgArgProGlnAspValLysPhe 
5342 CCTAAACCTCAAAGAAAGACCAAACGTAACACCAACCGGCGGCCGCAGGACGTCAAGTTC 
GGATTTGGAGTTTCTTTCTGGTTTGCATTGTGGTTGGCCGCCGGCGTCCTGCAGTTCAAG 

5380 NOTI, 5381 EAG1 XMA3, 5390 AAT2, 5401 SMAI XMAI, 

ProGlyGlyGlyGlnIleValGlyGlyValTyrLeuLeuProArgArgGlyProArgLeu 
5402 CCGGGTGGCGGTCAGATCGTTGGTGGAGTTTACTTGTTGCCGCGCAGGGGCCCTAGATTG 
GGCCCACCGCCAGTCTAGCAACCACCTCAAATGAACAACGGCGCGTCCCCGGGATCTAAC 
5449 APAI, 

GlyValArgAlaThrArgLysThrSerGluArgSerGlnProArgGlyArgArgGlnPro 
5462 GGTGTGCGCGCGACGAGAAAGACTTCCGAGCGGTCGCAACCTCGAGGTAGACGTCAGCCT 
CCACACGCGCGCTGCTCTTTCTGAAGGCTCGCCAGCGTTGGAGCTCCATCTGCAGTCGGA 

5467 BSSH2, 5478 XMNI, 5502 XHOI, 5511 AAT2, 

IleProLysAlaArgArgProGluGlyArgThrTrpAlaGlnProGlyTyrProTrpPro 
5522 ATCCCCAAGGCTCGTCGGCCCGAGGGCAGGACCTGGGCTCAGCCCGGGTACCCTTGGCCC 
TAGGGGTTCCGAGCAGCCGGGCTCCCGTCCTGGACCCGAGTCGGGCCCATGGGAACCGGG 

5548 ALWN1, 5558 ESP1, 5564 SMAI XMAI, 5568 KPNI, 

LeuTyrGlyAsnGluGlyCysGlyTrpAlaGlyTrpLeuLeuSerProArgGlySerArg 
5582 CTCTATGGCAATGAGGGCTGCGGGTGGGCGGGATGGCTCCTGTCTCCCCGTGGCTCTCGG 
GAGATACCGTTACTCCCGACGCCCACCCGCCCTACCGAGGACAGAGGGGCACCGAGAGCC 

ProSerTrpGlyProThrAspProArgArgArgSerArgAsnLeuGlyLysValIleAsp 
5642 CCTAGCTGGGGCCCCACAGACCCCCGGCGTAGGTCGCGCAATTTGGGTAAGGTCATCGAT 
GGATCGACCCCGGGGTGTCTGGGGGCCGCATCCAGCGCGTTAAACCCATTCCAGTAGCTA 

5650 APAI, 5696 CLAI, 

ThrLeuThrCysGlyPheAlaAspLeuMetGlyTyrIleProLeuValGlyAlaProLeu 
5702 ACCCTTACGTGCGGCTTCGCCGACCTCATGGGGTACATACCGCTCGTCGGCGCCCCTCTT 
TGGGAATGCACGCCGAAGCGGCTGGAGTACCCCATGTATGGCGAGCAGCCGCGGGGAGAA 

5724 HGIE2, 5750 KAS1 NARI, 5756 ECON1, 

GlyGlyAlaAlaArgAlaOC AM 
5762 GGAGGCGCTGCCAGGGCCTAATAGTCGAC 
CCTCCGCGACGGTCCCGGATTATCAGCTG 

5785 SALI, 